Annotated Hungarian National Corpus

نویسندگان

  • Zoltán Alexin
  • János Csirik
  • Tibor Gyimóthy
  • Károly Bibok
  • Csaba Hatvani
  • Gábor Prószéky
  • László Tihanyi
چکیده

Zoltan Alexin Department of Informatics University of Szeged [email protected]—szeged.hu Tibor Gyinnithy Research Group on Artifical Intelligence at University of Szeged [email protected]—szeged.hu Csaba Hatvani Department of Informatics University of Szeged [email protected]—szeged.hu LaszlO Tihanyi MorphoLogic Budapest [email protected] Janos Csirik Department of Informatics University of Szeged [email protected]—szeged.hu Karoly Bibok Slavic Institute University of Szeged [email protected]—szeged.hu Gabor PrOszeky MorphoLogic Budapest [email protected]

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Hungarian Sentiment Corpus Manually Annotated at Aspect Level

In this paper we present a Hungarian sentiment corpus manually annotated at aspect level. Our corpus consists of Hungarian opinion texts written about different types of products. The main aim of creating the corpus was to produce an appropriate database providing possibilities for developing text mining software tools. The corpus is a unique Hungarian database: to the best of our knowledge, no...

متن کامل

The Szeged Corpus: A POS Tagged and Syntactically Annotated Hungarian Natural Language Corpus

The Szeged Corpus is a manually annotated natural language corpus currently comprising 1.2 million word entries, 145 thousand different word forms, and an additional 225 thousand punctuation marks. With this, it is the largest manually processed Hungarian textual database that serves as a reference material for research in natural language processing as well as a learning database for machine l...

متن کامل

Light Verb Constructions in the SzegedParalellFX English-Hungarian Parallel Corpus

In this paper, we describe the first English–Hungarian parallel corpus annotated for light verb constructions, which contains 14,261 sentence alignment units. Annotation principles and statistical data on the corpus are also provided, and English and Hungarian data are contrasted. On the basis of corpus data, a database containing pairs of English–Hungarian light verb constructions has been cre...

متن کامل

Manually Annotated Hungarian Corpus

Current paper presents the results of a two-year project during which a consortium of the University of Szeged and the MorphoLogic Ltd. Budapest developed a morpho-syntactically parsed and annotated (disambiguated) corpus for Hungarian. For morpho-syntactic encoding, the Hungarian version of MSD (MorphoSyntactic Description) has been used. The corpus contains texts of five different topic areas...

متن کامل

Hungarian Corpus of Light Verb Constructions

The precise identification of light verb constructions is crucial for the successful functioning of several NLP applications. In order to facilitate the development of an algorithm that is capable of recognizing them, a manually annotated corpus of light verb constructions has been built for Hungarian. Basic annotation guidelines and statistical data on the corpus are also presented in the pape...

متن کامل

Morphological annotation of Old and Middle Hungarian corpora

In our paper, we present a computational morphology for Old and Middle Hungarian used in two research projects that aim at creating morphologically annotated corpora of Old and Middle Hungarian. In addition, we present the web-based disambiguation tool used in the semi-automatic disambiguation of the annotations and the structured corpus query tool that has a unique but very useful feature of m...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003